BACKGROUND

1. Field of Invention 

This invention relates generally to digital logic and
specifically to carry look-ahead adders.

2. Description of Related Art 

FIG. 1 is a block diagram of a conventional 12-bit carry
look-ahead adder 100 having three processing stages in which
input signals A[11:0] and B[11:0] are logically combined to
generate a 12-bit sum signal S[11:0] and a carry-out bit Cout.
In the first stage, groups of three input signal pairs are
combined in conventional 3-bit propagate circuits (P3) 200 and
3-bit generate circuits (G3) 300 to produce well-known carry-
propagate signals P[z→x] and carry-generate signals G[z→x],
respectively. Specifically, each carry-propagate circuit 200
logically combines three bit-pairings of the input signals
A[z:x] and B[z:x] to generate its carry-propagate signal
P[z→x] according to the well-known logical expression P[z→x] =
(Ax + Bx) | (Ay + By) | (Az + Bz), where | denotes the logical AND
operation and + denotes the logical OR operation. The groups
of three bit-pairings A[z:x] and B[z:x] are also logically
combined in G3 circuits 300, each of which generates its
carry-generate signal G[z→x] according to the well-known
logical expression G[z→x] = Az|Bz + (Az + Bz) | [Ay|By + (Ay +
By) | (Ax|Bx)].
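
The two first-stage expressions above can be checked with a short
Python sketch. It is illustrative only: the helper names bit, p3, and
g3 are assumptions introduced here, integers stand in for the input
signals, and x is taken to be the lowest bit of the group (with
y = x + 1 and z = x + 2).

```python
def bit(v, i):
    """Return bit i of integer v as 0 or 1."""
    return (v >> i) & 1

def p3(a, b, x):
    """P[z->x] = (Ax + Bx)|(Ay + By)|(Az + Bz)."""
    return all(bit(a, i) | bit(b, i) for i in range(x, x + 3))

def g3(a, b, x):
    """G[z->x] = Az|Bz + (Az + Bz)|[Ay|By + (Ay + By)|(Ax|Bx)]."""
    ax, ay, az = bit(a, x), bit(a, x + 1), bit(a, x + 2)
    bx, by, bz = bit(b, x), bit(b, x + 1), bit(b, x + 2)
    # In Python, & binds tighter than |, matching AND-before-OR above.
    return bool(az & bz | (az | bz) & (ay & by | (ay | by) & (ax & bx)))

# 0b101 + 0b011 overflows three bits, so the group generates a carry.
print(g3(0b101, 0b011, 0))
```

Under these assumptions, g3 equals the carry-out of the 3-bit group
with no carry-in, and g3 OR p3 equals the carry-out when a carry-in is
present.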

In order to maximize speed, P3 circuits 200 and G3
circuits 300 are typically implemented using dynamic logic as
shown, for example, in FIGS. 2 and 3, respectively, where PMOS
pull-up transistor MP1 and NMOS pull-down transistor MN1 are
each responsive to a clock signal CLK. Thus, when CLK is logic
low, transistor MP1 turns on and pulls node N1 high to VDD to
set the output signal (e.g., P[z→x] or G[z→x]) to logic low
via inverter INV1, and transistor MN1 turns off to isolate
node N2 from ground potential. When CLK transitions to logic
high, transistor MP1 turns off and transistor MN1 turns on,
thereby allowing input signals A[z:x] and B[z:x] to determine
the logic state of the output signal P[z→x] or G[z→x]. When
CLK transitions back to logic low, the output signal is again
returned to logic low via inverter INV1 and pull-up transistor
MP1.

The second stage of adder 100 includes well-known carry
look-ahead (CLA) logic 400 that combines the carry-generate
and carry-propagate signals provided by the first stage to
simultaneously produce accumulated carry information at 3-bit
intervals. Specifically, the carry-generate and carry-
propagate signals from respective G3 circuits 300 and P3
circuits 200 are provided to and logically combined in CLA
logic 400 to simultaneously produce accumulated
carry-generate signals G[2→0], G[5→0], G[8→0], and
G[11→0], where G[2→0] represents the carry-out from the first
3 bit positions 0 to 2, G[5→0] represents the carry-out from
the first 6 bit positions 0 to 5, G[8→0] represents the carry-
out from the first 9 bit positions 0 to 8, and G[11→0]
represents the carry-out from all 12 bit positions, and
therefore also provides the carry-out bit Cout for adder 100.

CLA logic 400 includes well-known CLA blocks 410, 420,
and 430, and in response to the carry-generate signals G[z→x]
and carry-propagate signals P[z→x], generates in parallel the
accumulated carry-generate signals G[2→0], G[5→0], G[8→0], and
G[11→0]. G[2→0] is generated by G3 circuit 300a,
and may pass unmodified through CLA logic 400. CLA block 410
generates G[5→0] according to the logical expression G[5→0] =
G[5→3] + P[5→3]|G[2→0]. CLA block 420 generates G[8→0]
according to the logical expression G[8→0] = G[8→6] +
P[8→6]|G[5→3] + P[8→6]|P[5→3]|G[2→0]. CLA block 430 generates
G[11→0] according to the logical expression G[11→0] = G[11→9]
+ P[11→9]|G[8→6] + P[11→9]|P[8→6]|G[5→3] +
P[11→9]|P[8→6]|P[5→3]|G[2→0]. Exemplary circuit diagrams for
CLA blocks 410, 420, and 430 implemented in dynamic logic are
shown in FIGS. 4A, 4B, and 4C, respectively.
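
The accumulation performed by CLA blocks 410, 420, and 430 can be
sketched as follows. This is a behavioral model under stated
assumptions, not a description of the circuits of FIGS. 4A-4C: the
names group_pg and cla_400 are introduced here for illustration, and
each accumulated signal is formed as a flat OR of AND terms in the same
shape as the expressions above.

```python
def group_pg(a, b, x):
    """Return (P, G) for the 3-bit group whose lowest bit is x."""
    prop, gen = True, False
    for i in range(x, x + 3):               # walk from bit x up to bit z
        ai, bi = (a >> i) & 1, (b >> i) & 1
        gen = bool(ai & bi) or (bool(ai | bi) and gen)
        prop = prop and bool(ai | bi)
    return prop, gen

def cla_400(a, b):
    """Return [G[2->0], G[5->0], G[8->0], G[11->0]] for 12-bit a and b."""
    pg = [group_pg(a, b, 3 * k) for k in range(4)]
    acc = []
    for k in range(4):
        # One AND term per group j <= k: the G of group j gated by the
        # P of every group above it, up to and including group k.
        terms = [pg[j][1] and all(pg[m][0] for m in range(j + 1, k + 1))
                 for j in range(k + 1)]
        acc.append(any(terms))
    return acc

print(cla_400(0xFFF, 0x001))   # a carry ripples out of every group
```

Each returned value should equal the carry-out of the corresponding
low-order 3, 6, 9, or 12 bits of a + b.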

The third stage of adder 100 includes conventional sum
circuits 500 that together logically combine the accumulated
carry information provided by the second stage CLA logic 400
with the input signals A[11:0] and B[11:0] to generate the sum
signal S[11:0]. Specifically, a grounded signal Cin and the
accumulated carry-generate signals G[2→0], G[5→0], and G[8→0]
are provided as carry-in signals to respective sum circuits
500a-500d to generate corresponding 3-bit groups of the sum
signal in a well-known manner. For example, sum circuit 500a
combines A[2:0], B[2:0], and a grounded (i.e., logic low)
carry-in bit Cin to generate sum bits S[2:0], sum circuit 500b
combines A[5:3], B[5:3], and carry-in bit G[2→0] to generate
sum bits S[5:3], sum circuit 500c combines A[8:6], B[8:6], and
carry-in bit G[5→0] to generate sum bits S[8:6], and sum
circuit 500d combines A[11:9], B[11:9], and carry-in bit
G[8→0] to generate sum bits S[11:9].

Typically, each sum circuit 500 generates well-known sum0
and sum1 signals in response to the input signals A and B, and
uses the carry-in bit (e.g., G[z→x]) to select between
outputting either the sum0 or sum1 bits to form the sum signal
S. For example, FIG. 5 shows a conventional 3-bit sum circuit
500 including 3-bit carry-ripple adders. Three sum0 bits are
generated by full adders 502a-502c in response to logical
combinations of Ax and Bx, Ay and By, and Az and Bz,
respectively, with a logic low (i.e., grounded) carry-in bit
Cin, and three sum1 bits are generated by full adders 504a-
504c in response to logical combinations of Ax and Bx, Ay and
By, and Az and Bz, respectively, with a logic high (i.e., tied
to VDD) carry-in bit Cin. Multiplexers 506a-506c selectively
output either the sum0 or sum1 bits as respective sum bits Sx,
Sy, and Sz in response to the logic state of the corresponding
carry signal G[u→w]. Because the sum0 and sum1 bits are
generated before G[u→w] is available, the 3-bit carry-ripple
adders of sum circuit 500 do not degrade performance of adder
100.
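
The select-between-two-sums behavior of sum circuit 500 can be modeled
in a few lines. The sketch below is an assumption-laden illustration:
ripple3 and sum_circuit are names introduced here, and the full adders
and multiplexers of FIG. 5 are modeled only at the logical level.

```python
def ripple3(a3, b3, cin):
    """3-bit carry-ripple adder built from full adders; returns 3 sum bits."""
    s, c = 0, cin
    for i in range(3):
        ai, bi = (a3 >> i) & 1, (b3 >> i) & 1
        s |= (ai ^ bi ^ c) << i               # full-adder sum output
        c = (ai & bi) | (c & (ai ^ bi))       # full-adder carry output
    return s

def sum_circuit(a3, b3, carry):
    """Compute both candidate sums up front, then select on the carry."""
    sum0 = ripple3(a3, b3, 0)   # assumes carry-in is logic low
    sum1 = ripple3(a3, b3, 1)   # assumes carry-in is logic high
    return sum1 if carry else sum0

print(sum_circuit(0b011, 0b010, 1))   # 3 + 2 + 1
```

Because both candidates depend only on A and B, they can be computed
while the carry is still being resolved, which is the point made in the
paragraph above.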

Although CLA adder 100 is much faster than carry-ripple
adders, it would nevertheless be desirable to further improve
its performance. For example, referring again to FIG. 2, P3
circuit 200 includes two paths of three transistors connected
in series between nodes N1 and N2 (i.e., transistors 201-203
and transistors 204-206), and thus has a stack height of
three. Referring to FIG. 3, G3 circuit 300 includes a
discharge path having four stacked input transistors 306-309
connected in series between nodes N1 and N2, and thus has a
stack height of four. Because of the well-known body effect
phenomenon, the addition of each stacked input transistor
significantly reduces the switching speed of the corresponding
logic circuit. As a result, because G3 circuit 300 has a stack
height of four and P3 circuit 200 has a stack height of three,
G3 circuit 300 is much slower than P3 circuit 200, and
therefore determines the critical path of the adder 100.
Accordingly, it would be desirable to reduce the stack heights
of the first stage logic circuits 200 and 300 in order to
increase performance.

In addition, G3 circuit 300 includes one discharge path
having four stacked input transistors 306-309 and another
discharge path having only two stacked input transistors 301-
302. Since the series resistance of the four transistors 306-
309 is much greater than the series resistance of the two
transistors 301-302, transistors 306-309 are typically sized
to be much larger than transistors 301-302 in order to
maintain similar speeds for their respective discharge paths.
However, increasing the size of transistors 306-309 in an
effort to achieve balanced operation also increases parasitic
capacitances, which in turn further reduces the speed of G3
circuit 300. Increasing the size of transistors 306-309 also
increases the input capacitance of circuit 300, which in turn
undesirably loads circuitry (not shown) that provides input
signals to circuit 300. Thus, it would also be desirable for
an adder's first stage logic circuits to have better-balanced
discharge paths.

SUMMARY 

A method and apparatus are disclosed that increase the 
speed of carry look-ahead (CLA) adders by reducing the stack 
height of their first stage logic circuits. In accordance with 
the present invention, a CLA adder capable of adding (or 
subtracting) two input signals includes first stage logic 
having a plurality of carry-create and carry-transmit logic
circuits each coupled to receive one or more bits of each 
input signal. Each carry-create circuit generates a novel 
carry-create signal in response to corresponding first bit- 
pairings of the input signals, and each carry-transmit circuit 
generates a novel carry-transmit signal in response to 
corresponding second bit-pairings of the input signals. The 
carry-create and carry-transmit signals are combined in CLA
logic to generate accumulated carry-create signals, which are 
then used to select final sum bits. 

For one embodiment, each carry-create circuit is coupled
to receive 3 bit-pairings of input signals A and B, and
generates a corresponding carry-create signal according to the
logical expression J[z→x] = (Az|Bz) + (Ay|By) + (Ax|Bx), where
| represents the logical AND operation, + indicates the
logical OR operation, and x, y, and z represent bit positions
in the input signals A and B. The carry-create circuit
implements three 2-input AND terms, and thus has a stack
height of two. Each carry-transmit circuit is coupled to
receive 3 bit-pairings of input signals A and B, and generates
a corresponding carry-transmit signal according to the logical
expression T[z→x] = (Az + Bz) | [(Ay + By) | (Ax + Bx) +
(Ay|By)]. The carry-transmit circuit implements a 3-input AND
term, and thus has a stack height of three. By comparison,
prior art carry-propagate and carry-generate circuits have
stack heights of three and four, respectively. Thus, because
the first stage carry-create and carry-transmit circuits of
the present invention have lower stack heights than do prior
art first stage carry-propagate and carry-generate circuits,
adders that incorporate Applicant's first stage carry-create
and carry-transmit circuits are faster than prior art adders
that utilize conventional carry-propagate and carry-generate
circuits.

In addition, Applicant's carry-create and carry-transmit
logic circuits have evenly balanced discharge paths. For one
embodiment, each discharge path in the carry-create logic
circuit includes two stacked input transistors, and each
discharge path in the carry-transmit logic circuit includes
three stacked input transistors. As a result, Applicant's
carry-create and carry-transmit logic circuits do not require
transistor sizing adjustments to maintain balanced operation,
which may result in an even greater performance advantage over
prior art CLA adders.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of the present invention are 
illustrated by way of example and are by no means intended to 
limit the scope of the present invention to the particular 
embodiments shown, and in which: 

FIG. 1 is a block diagram of a conventional 12-bit carry
look-ahead adder that produces carry information at 3-bit
intervals;

FIG. 2 is a circuit diagram of a conventional 3-bit
carry-propagate circuit of the adder of FIG. 1; 

FIG. 3 is a circuit diagram of a conventional 3-bit
carry-generate circuit of the adder of FIG. 1; 

FIGS. 4A-4C are circuit diagrams of conventional CLA 
blocks of the adder of FIG. 1; 

FIG. 5 is a block diagram of a conventional sum circuit 
of the adder of FIG. 1; 

FIG. 6 is a block diagram of a 12-bit carry look-ahead
adder that produces carry information at 3-bit intervals in
accordance with the present invention; 

FIG. 7 is a circuit diagram of one embodiment of a 3-bit
carry-create circuit of the adder of FIG. 6; 

FIG. 8 is a circuit diagram of one embodiment of a 3-bit
carry-transmit circuit of the adder of FIG. 6;

FIG. 9 is a circuit diagram of one embodiment of a carry 
translation circuit of the adder of FIG. 6; 

FIGS. 10A-10C are circuit diagrams of one embodiment of 
CLA blocks of the adder of FIG. 6; 

FIG. 11 is a block diagram of one embodiment of a sum 
generator of the adder of FIG. 6; and 

FIG. 12 is a circuit diagram of one embodiment of the sum 
generator of FIG. 11. 

Like reference numerals refer to corresponding parts 
throughout the drawing figures. 

DETAILED DESCRIPTION

Present embodiments are discussed below in the context of
a 12-bit adder 100 for simplicity only. It is to be understood
that present embodiments are equally applicable to adders that
combine input signals of other various bit lengths. Further,
although described below in the context of dynamic logic,
embodiments of the present invention may be implemented in
static logic. Also, the specific configurations of logic
circuits disclosed for implementing various logical
expressions described in accordance with the present invention
may be modified as desired. In addition, adders of the present
invention may be readily used to perform arithmetic
subtraction operations. Accordingly, the present invention is
not to be construed as limited to specific examples described
herein but rather includes within its scope all embodiments
defined by the appended claims.

FIG. 6 is a block diagram of one embodiment of a 12-bit
carry look-ahead (CLA) adder 600 in accordance with the
present invention. Adder 600 is shown in FIG. 6 and described
herein as logically combining first and second 12-bit input
signals A[11:0] and B[11:0] to produce a 12-bit sum signal
S[11:0] and a carry-out bit Cout. The carry-in bit Cin to adder
600 is tied to ground potential to indicate that there is no
carry-in bit. Adder 600 includes three stages of processing.
The first stage includes four 3-bit carry-create (J3) circuits
700a-d, three 3-bit carry-transmit (T3) circuits 800a-c, and
four 2-bit carry translation (T2) circuits 850a-850d. The
second stage includes CLA logic 900. The third stage includes
four 3-bit sum generators 602a-602d and a logic gate 604.

T2 circuits 850 and T3 circuits 800 are shown as separate
logic elements in the block diagram of FIG. 6 for clarity
only. As explained below, T3 circuits 800 and T2 circuits 850
include similar logic. Thus, in some embodiments, T2 circuits
850a-850c may be incorporated within T3 circuits 800a-800c,
respectively, in order to eliminate duplicative logic and
thereby reduce overall transistor count, which in turn
advantageously reduces both silicon area and power
consumption.

In accordance with the present invention, the first stage 
J3 circuits 700 and T3 circuits 800 generate a plurality of 
carry-create signals J and carry-transmit signals T, 
respectively, using the input signals A and B in a novel 
manner that improves performance compared to the generation of 
conventional first stage carry-generate and carry-propagate 
signals used, for example, in the prior art adder 100 of FIG. 
1. T2 circuits 850 generate carry translation signals CT using 
the input signals A and B. The second stage CLA logic 900 
logically combines the carry-create and carry-transmit signals 
to produce a number of accumulated carry-create signals that
represent carry information at 3-bit intervals. For one
embodiment, the second stage CLA logic 900 is conventional. 
The third stage sum generators 602 combine input signals A and 
B with the accumulated carry-create signals J and carry 
translation signals CT to produce the sum signal S. 

Specifically, first groups of three bit-pairings of the
input signals are logically combined in J3 circuits 700 to
generate carry-create signals J[2→0], J[5→3], J[8→6], and
J[11→9]. For example, input signal bits A[2:0] and B[2:0] are
combined in J3 circuit 700a to generate J[2→0], input signal
bits A[5:3] and B[5:3] are combined in J3 circuit 700b to
generate J[5→3], input signal bits A[8:6] and B[8:6] are
combined in J3 circuit 700c to generate J[8→6], and input
signal bits A[11:9] and B[11:9] are combined in J3 circuit
700d to generate J[11→9]. Each J3 circuit 700 generates its
carry-create signal J according to the logical expression
J[z→x] = (Az|Bz) + (Ay|By) + (Ax|Bx), where | represents the
logical AND operation, + indicates the logical OR operation,
and x, y, and z represent bit positions in the input signals A
and B. A circuit diagram of one embodiment of J3 circuit 700
implemented in dynamic logic is shown in FIG. 7.

Second groups of three bit-pairings of the input signals
are logically combined in T3 circuits 800 to produce carry-
transmit signals T[3→1], T[6→4], and T[9→7]. For example,
input signal bits A[3:1] and B[3:1] are combined in T3 circuit
800a to generate T[3→1], input signal bits A[6:4] and B[6:4]
are combined in T3 circuit 800b to generate T[6→4], and input
signal bits A[9:7] and B[9:7] are combined in T3 circuit 800c
to generate T[9→7]. Each carry-transmit circuit 800 generates
its carry-transmit signal T according to the logical
expression T[z→x] = (Az + Bz) | [(Ay + By) | (Ax + Bx) +
(Ay|By)]. A circuit diagram of one embodiment of T3 circuit
800 implemented in dynamic logic is shown in FIG. 8.

Note that the first groups of input signal bit-pairings
(which are combined in J3 circuits 700) are different from the
second groups of input signal bit-pairings (which are combined
in T3 circuits 800). For example, while the first groups of
bit-pairings respectively include bits 0-2, 3-5, 6-8, and 9-
11, the second groups of bit-pairings respectively include
bits 1-3, 4-6, and 7-9.

Third groups of two bit-pairings of the input signals are
logically combined in T2 circuits 850 to produce carry
translation signals CT[2→1], CT[5→4], CT[8→7], and CT[11→10].
For example, input signal bits A[2:1] and B[2:1] are combined
in T2 circuit 850a to generate carry translation signal
CT[2→1], input signal bits A[5:4] and B[5:4] are combined in
T2 circuit 850b to generate carry translation signal CT[5→4],
input signal bits A[8:7] and B[8:7] are combined in T2 circuit
850c to generate carry translation signal CT[8→7], and input
signal bits A[11:10] and B[11:10] are combined in T2 circuit
850d to generate carry translation signal CT[11→10]. Each T2
circuit 850 generates its carry translation signal CT
according to the logical expression CT[y→x] = (Ay + By) | (Ax +
Bx) + Ay|By. A circuit diagram of one embodiment of a dynamic
logic implementation of T2 circuit 850 is shown in FIG. 9.
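
The relationship among the J, T, and CT expressions can be checked
numerically. The following Python sketch is illustrative only; the
names bits, j3, ct2, t3, and g3 are assumptions introduced here, and
g3 is computed arithmetically as the conventional group carry-out used
for comparison.

```python
def bits(v, x, n):
    """Return n bits of integer v starting at bit x, lowest bit first."""
    return [(v >> (x + k)) & 1 for k in range(n)]

def j3(a, b, x):
    """Carry-create J[z->x] = (Az|Bz) + (Ay|By) + (Ax|Bx)."""
    return any(ai & bi for ai, bi in zip(bits(a, x, 3), bits(b, x, 3)))

def ct2(a, b, x):
    """Carry translation CT[y->x] = (Ay + By)|(Ax + Bx) + Ay|By."""
    (ax, ay), (bx, by) = bits(a, x, 2), bits(b, x, 2)
    return bool((ay | by) & (ax | bx) | ay & by)

def t3(a, b, x):
    """Carry-transmit T[z->x] = (Az + Bz)|[(Ay + By)|(Ax + Bx) + Ay|By]."""
    az, bz = (a >> (x + 2)) & 1, (b >> (x + 2)) & 1
    return bool(az | bz) and ct2(a, b, x)

def g3(a, b, x):
    """Conventional G[z->x]: carry-out of bits x..x+2 with no carry-in."""
    return bool((((a >> x) & 7) + ((b >> x) & 7)) >> 3)

# For bits 2..0, J[2->0] ANDed with CT[2->1] recovers G[2->0].
a, b = 0b101, 0b011
print(j3(a, b, 0) and ct2(a, b, 1), g3(a, b, 0))
```

Under these assumptions, J[2→0] ANDed with CT[2→1] equals the
conventional G[2→0] for every input pair, which is why a carry
translation signal accompanies each accumulated carry-create signal
in the third stage.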

As mentioned above, T2 circuits 850 share common logic
with T3 circuits 800, and therefore may be incorporated into
T3 circuits 800. Referring also to FIG. 8, transistors 802-
803, 805-806, and 807-808 of T3 circuit 800 perform the
same logic function as transistors 851-852, 853-854, and
855-856 of T2 circuit 850, and therefore the signal CT[y→x]
may be taken at node N3 of T3 circuit 800 rather than being
generated in a separate T2 circuit 850. Thus, referring again
to FIG. 6, T3 circuit 800a may provide CT[2→1], T3 circuit
800b may provide CT[5→4], and T3 circuit 800c may provide
CT[8→7]. In this manner, T2 circuits 850a-850c may be
eliminated from the embodiment of FIG. 6.

The carry-create signals J and carry-transmit signals T
produced by respective J3 circuits 700 and T3 circuits 800 are
provided to and combined in second stage carry look-ahead
(CLA) logic 900 to simultaneously produce accumulated carry-
create signals at 3-bit intervals, i.e., J[2→0], J[5→0],
J[8→0], and J[11→0], where J[2→0] represents carry information
for the first 3 input signal bit-pairings 0 to 2, J[5→0]
represents carry information for the first 6 input signal bit-
pairings 0 to 5, J[8→0] represents carry information for the
first 9 input signal bit-pairings 0 to 8, and J[11→0]
represents carry information for all 12 input signal bit-
pairings.

CLA logic 900 includes CLA blocks 910, 920, and 930, and
operates to simultaneously generate the accumulated carry-
create signals J[2→0], J[5→0], J[8→0], and J[11→0] in response
to the carry-create and carry-transmit signals provided by
first stage J3 circuits 700 and T3 circuits 800. For one
embodiment, accumulated carry-create signal J[2→0], which is
generated by J3 circuit 700a, may pass unmodified through CLA
logic 900. In other embodiments, CLA logic 900 may generate
J[2→0] internally.

CLA block 910 logically combines J[2→0], J[5→3], and
T[3→1] to generate J[5→0] according to the logical expression
J[5→0] = J[5→3] + T[3→1]|J[2→0]. For some embodiments, CLA
block 910 may be conventional CLA block 410 used in the prior
art adder 100 of FIG. 1. A circuit diagram of one embodiment
of a dynamic logic implementation of CLA block 910 is shown in
FIG. 10A.

CLA block 920 logically combines J[2→0], J[5→3], J[8→6],
T[3→1], and T[6→4] to generate signal J[8→0] according to the
logical expression J[8→0] = J[8→6] + T[6→4]|J[5→3] +
T[6→4]|T[3→1]|J[2→0]. For some embodiments, CLA block 920 may
be conventional CLA block 420 used in the prior art adder 100
of FIG. 1. A circuit diagram of one embodiment of a dynamic
logic implementation of CLA block 920 is shown in FIG. 10B.

CLA block 930 logically combines J[2→0], J[5→3], J[8→6],
J[11→9], T[3→1], T[6→4], and T[9→7] to generate signal J[11→0]
according to the logical expression J[11→0] = J[11→9] +
T[9→7]|J[8→6] + T[9→7]|T[6→4]|J[5→3] +
T[9→7]|T[6→4]|T[3→1]|J[2→0]. A circuit diagram of one
embodiment of a dynamic logic implementation of CLA block 930
is shown in FIG. 10C.
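
The three CLA expressions above can be exercised with a behavioral
sketch. It is offered under stated assumptions only: first_stage and
cla_900 are hypothetical names, Python integers stand in for the
12-bit input signals, and the check that each accumulated J ANDed with
its group's CT equals the true carry is an observation drawn from the
expressions, not a description of FIGS. 10A-10C.

```python
def first_stage(a, b):
    """Return (J per group, T across group boundaries, CT per group)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(12)]     # Ai|Bi
    p = [((a >> i) | (b >> i)) & 1 for i in range(12)]   # Ai + Bi

    def ct(y, x):
        # CT[y->x] = (Ay + By)|(Ax + Bx) + Ay|By
        return bool(p[y] & p[x] | g[y])

    j = [bool(g[3*k] | g[3*k + 1] | g[3*k + 2]) for k in range(4)]
    # T[3->1], T[6->4], T[9->7]: the top bit's OR term gated by CT below it.
    t = [bool(p[3*k + 3]) and ct(3*k + 2, 3*k + 1) for k in range(3)]
    c = [ct(3*k + 2, 3*k + 1) for k in range(4)]
    return j, t, c

def cla_900(j, t):
    """Return [J[2->0], J[5->0], J[8->0], J[11->0]] as flat OR-of-AND terms."""
    j20 = j[0]
    j50 = j[1] or (t[0] and j[0])
    j80 = j[2] or (t[1] and j[1]) or (t[1] and t[0] and j[0])
    j110 = (j[3] or (t[2] and j[2]) or (t[2] and t[1] and j[1])
            or (t[2] and t[1] and t[0] and j[0]))
    return [j20, j50, j80, j110]

j, t, c = first_stage(0x0FF, 0x001)
print(cla_900(j, t))
```

Note that an accumulated carry-create signal alone is not the true
carry; in this model the true carry out of bit 3k+2 is obtained by
ANDing the accumulated J with CT[3k+2→3k+1].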

The third stage sum generators 602 logically combine
three corresponding bit-pairings of the input signals A[z:x]
and B[z:x] with corresponding accumulated-carry signals J[z→x]
and carry translation signals CT[y→x] to generate the bits of
the sum signal. For example, sum generator 602a logically
combines A[2:0] and B[2:0] (with grounded carry information)
to generate sum bits S[2:0], sum generator 602b logically
combines A[5:3] and B[5:3] with carry information indicated by
J[2→0] and CT[2→1] to generate sum bits S[5:3], sum generator
602c logically combines A[8:6] and B[8:6] with carry
information indicated by J[5→0] and CT[5→4] to generate sum
bits S[8:6], and sum generator 602d logically combines A[11:9]
and B[11:9] with carry information indicated by J[8→0] and
CT[8→7] to generate sum bits S[11:9]. Signals J[11→0] and
CT[11→10] are logically combined in logic gate 604 to generate
the carry-out bit Cout. For one embodiment, logic gate 604 is a
well-known AND gate.

FIG. 11 shows a sum generator 1100 that is one embodiment
of sum generator 602 of FIG. 6. Sum generator 1100 includes a
sum circuit 1102, translation logic 1104, and a MUX 1106. Sum
circuit 1102 adds 3 corresponding bits of input signals A and
B to generate 3-bit pre-sum signals PSUM1 and PSUM0, where
PSUM1 assumes a logic high carry-in bit and PSUM0 assumes a
logic low carry-in bit. Sum circuit 1102 is well-known, and
may include two 3-bit carry-ripple adders to generate PSUM1
and PSUM0. For one embodiment, sum circuit 1102 may include
the configuration of full adders 502a-502c and 504a-504c of
prior art sum circuit 500 of FIG. 5, where the sum1 and sum0
signals of FIG. 5 correspond to the PSUM1 and PSUM0 signals,
respectively, of FIG. 11.

Signals PSUM1 and PSUM0 are provided to translation logic
1104, which in turn uses the corresponding carry translation
signal CT to convert pre-sum signals PSUM1 and PSUM0 into sum
signals SUM1 and SUM0, respectively. For one embodiment,
translation circuit 1104 generates SUM1 according to the
logical expression SUM1 = PSUM1|CT and generates SUM0
according to the logical expression SUM0 = PSUM0|CTB, where
CTB is formed by logically complementing the input signals to
the T2 circuits 850, i.e., CTB[A, B] = CT[Ā, B̄], where an
overbar denotes the logical complement. Thus, for
example, CTB[y→x] = (Āy + B̄y) | (Āx + B̄x) + Āy|B̄y.

Signals SUM1 and SUM0 are provided as inputs to MUX 1106
which, in response to the J signal, selects either SUM1 or
SUM0 to output as sum bits S. In this manner, translation
logic 1104 and MUX 1106 generate each sum bit from its
corresponding pre-sum bit according to the logical expression
S = J|CT|PSUM1 + JB|CTB|PSUM0.

Each of the secondary accumulated carry-create signals
JB[11→0], JB[8→0], JB[5→0], and JB[2→0] may be generated in a
manner similar to that described above with respect to
corresponding accumulated carry-create signals J[11→0],
J[8→0], J[5→0], and J[2→0] where the A and B input signals are
complemented before generating corresponding carry-create
signals J[z→x] and carry-transmit signals T[z→x]. For example,
CLA block 910 may generate JB[5→0] in response to JB[5→3],
JB[2→0], and TB[3→1], where JB[5→3] = Ā5|B̄5 + Ā4|B̄4 + Ā3|B̄3,
JB[2→0] = Ā2|B̄2 + Ā1|B̄1 + Ā0 + B̄0, and TB[3→1] = (Ā3 + B̄3) |
[(Ā2 + B̄2) | (Ā1 + B̄1) + (Ā2|B̄2)]. In this manner, JB|CTB is
the logical complement of J|CT. Note that where sum generator
1100 is used as sum generator 602a of FIG. 6, the CT and J
signals may be grounded so that SUM0 is selected to provide
sum bits S[2:0].
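
The sum-selection expression S = J|CT|PSUM1 + JB|CTB|PSUM0 can be
modeled end to end for the 12-bit case. The sketch below rests on
several assumptions that are not taken from the figures: the names
stage1, accumulate, and add12 are hypothetical, the accumulated signals
are built with a ripple-style recurrence that is logically equivalent
to the flat CLA expressions, the bit-0 term of JB[2→0] is taken as
(Ā0 + B̄0), and the pre-sums are computed arithmetically rather than
with full adders.

```python
MASK12 = 0xFFF

def stage1(a, b, dual=False):
    """Return (J, T, CT) lists; with dual=True return (JB, TB, CTB)."""
    if dual:                        # complement the inputs first
        a, b = ~a & MASK12, ~b & MASK12
    g = [(a >> i) & (b >> i) & 1 for i in range(12)]     # AND terms
    p = [((a >> i) | (b >> i)) & 1 for i in range(12)]   # OR terms
    if dual:
        g[0] = p[0]                 # JB[2->0] uses an OR term at bit 0
    ct = [bool(p[3*k + 2] & p[3*k + 1] | g[3*k + 2]) for k in range(4)]
    j = [bool(g[3*k] | g[3*k + 1] | g[3*k + 2]) for k in range(4)]
    t = [bool(p[3*k + 3]) and ct[k] for k in range(3)]
    return j, t, ct

def accumulate(j, t):
    """Accumulated signals for bit ranges [2->0], [5->0], [8->0], [11->0]."""
    acc = [j[0]]
    for k in range(1, 4):
        acc.append(j[k] or (t[k - 1] and acc[k - 1]))
    return acc

def add12(a, b):
    """Return (S[11:0], Cout) using S = J|CT|PSUM1 + JB|CTB|PSUM0."""
    j, t, ct = stage1(a, b)
    jb, tb, ctb = stage1(a, b, dual=True)
    J, JB = accumulate(j, t), accumulate(jb, tb)
    s = 0
    for k in range(4):
        a3, b3 = (a >> 3*k) & 7, (b >> 3*k) & 7
        psum0, psum1 = (a3 + b3) & 7, (a3 + b3 + 1) & 7
        if k == 0:                  # grounded J and CT: SUM0 is selected
            sel1, sel0 = False, True
        else:
            sel1 = J[k - 1] and ct[k - 1]      # J|CT
            sel0 = JB[k - 1] and ctb[k - 1]    # JB|CTB
        s |= ((psum1 if sel1 else 0) | (psum0 if sel0 else 0)) << 3*k
    return s, J[3] and ct[3]        # logic gate 604 modeled as an AND

print(add12(0x7FF, 0x001))
```

In this model exactly one of J|CT and JB|CTB is high for each group, so
exactly one pre-sum reaches the output, and the carry-out follows from
J[11→0] ANDed with CT[11→10].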

FIG. 12 shows one embodiment 1200 of a dynamic logic
implementation of translation circuit 1104 and MUX 1106 of sum
generator 1100 of FIG. 11. Circuit 1200 includes a PMOS pull-
up transistor 1201 coupled between VDD and the output at which
the sum bit S is provided. Circuit 1200 also includes NMOS
pull-down transistors 1206-1207 connected in series between
the output and ground potential, and NMOS pull-down
transistors 1208-1209 connected in series between the output
and ground potential. Signal J is provided to the gate of
transistor 1206, and signal JB is provided to the gate of
transistor 1208. Signals PSUM1 and CT are logically combined
in logic gate 1210 and provided to the gate of transistor 1207
as SUM1, and signals PSUM0 and CTB are logically combined in
logic gate 1211 and provided to the gate of transistor 1209 as
SUM0. Clock (CLK) and an enable signal (EN) are logically
combined in NAND gate 1202 and provided to the gate of
transistor 1201 via delay elements 1203-1205 to pull the
output to logic high when CLK is in a logic low state. When
CLK is logic high, the logic state of S is determined by
signals J, JB, CT, CTB, PSUM1, and PSUM0. Delay elements 1203-
1205 provide buffering and delay matching, and may be
eliminated in some embodiments.

It will be appreciated that other circuit configurations 
may be used to implement the logic functions of sum generator 
1100. For example, in other embodiments, carry translation 
signals CT may be logically ANDed with corresponding 
accumulated carry-create signals J to generate well-known 
accumulated carry-generate signals G, which in turn may be 
supplied to a conventional sum circuit (e.g., sum circuit 500 
of FIG. 5) to select between sum1 and sum0 bits. However, it
is to be appreciated that the implementation shown in FIG. 11 
does not introduce additional delay in generating the sum bits 
as compared to prior art sum circuits. Specifically, because 
translation logic 1104 provides the SUM1 and SUM0 signals
before J is generated by CLA logic 900 (see also FIG. 6), the
gate delays associated with the carry-ripple adders in sum
circuit 1102 and with translation logic 1104 overlap the
gate delays associated with generating J, and therefore do not
affect the performance of adder 600. In contrast, embodiments 
that combine the carry translation and accumulated carry- 
create signals to produce accumulated carry-generate signals 
introduce additional delay into the critical path of J, and 
are therefore less desirable. 

As discussed above, adder 600 may be faster than
conventional CLA adders such as, for example, adder 100 of
FIG. 1, because the first stage logic circuits (e.g., J3
circuit 700 and T3 circuit 800) of adder 600 have lower stack
heights than first stage logic circuits (e.g., P3 circuit 200
and G3 circuit 300) of conventional CLA adders. Referring
again to FIGS. 7 and 8, each discharge path of J3 circuit 700
includes two stacked input transistors coupled between nodes
N1 and N2, and thus J3 circuit 700 has a stack height of two,
and each discharge path of T3 circuit 800 includes three
stacked input transistors coupled between nodes N1 and N2, and
thus T3 circuit 800 has a stack height of three. In contrast,
prior art P3 circuit 200 of FIG. 2 has a stack height of
three, and prior art G3 circuit 300 of FIG. 3 has a stack
height of four. Accordingly, because the first stage carry-
create and carry-transmit circuits of present embodiments have
lower stack heights than prior art first stage carry-propagate
and carry-generate circuits, respectively, adders configured
in accordance with the present invention may be faster than
conventional CLA adders that employ prior art carry-propagate
and carry-generate logic circuits.

Further, in contrast to the prior art, Applicant's first 
stage logic circuits have evenly balanced discharge paths. 
Referring again to FIGS. 7 and 8, each discharge path in J3 
carry-create circuit 700 includes two stacked input 
transistors, and each discharge path in T3 circuit 800 
includes three stacked input transistors. By comparison, prior 
art carry-generate circuit 300 of FIG. 3 includes one 
discharge path having four stacked input transistors, a second 
discharge path having three stacked input transistors, and a 
third discharge path having two stacked input transistors, and 
therefore, as discussed above, requires re-sizing of its input 
transistors to maintain balanced operation. Thus, because 
Applicant's carry-create circuit 700 and carry-transmit
circuit 800 each have balanced discharge paths, transistor 
sizing modifications that compensate for different drive 
strengths are not necessary. 
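
The stack heights and path balance discussed above can be derived from
the logical expressions themselves. The sketch below is a simplified
model and not a description of FIGS. 2, 3, 7, or 8: it assumes each
input maps to one NMOS transistor, each AND maps to a series connection,
and each OR maps to parallel branches. The names AND, OR, and
path_lengths are introduced here for illustration.

```python
def AND(*terms):
    return ("and", terms)

def OR(*terms):
    return ("or", terms)

def path_lengths(expr):
    """Return the set of series-transistor counts over all discharge paths."""
    if isinstance(expr, str):            # a single input transistor
        return {1}
    op, terms = expr
    child_sets = [path_lengths(t) for t in terms]
    if op == "or":                       # parallel branches: separate paths
        return set().union(*child_sets)
    lengths = {0}                        # series connection: lengths add
    for cs in child_sets:
        lengths = {x + y for x in lengths for y in cs}
    return lengths

P3 = AND(OR("Ax", "Bx"), OR("Ay", "By"), OR("Az", "Bz"))
G3 = OR(AND("Az", "Bz"),
        AND(OR("Az", "Bz"),
            OR(AND("Ay", "By"),
               AND(OR("Ay", "By"), AND("Ax", "Bx")))))
J3 = OR(AND("Az", "Bz"), AND("Ay", "By"), AND("Ax", "Bx"))
T3 = AND(OR("Az", "Bz"),
         OR(AND(OR("Ay", "By"), OR("Ax", "Bx")), AND("Ay", "By")))

for name, expr in (("P3", P3), ("G3", G3), ("J3", J3), ("T3", T3)):
    lengths = path_lengths(expr)
    print(name, "stack height", max(lengths), "balanced", len(lengths) == 1)
```

Under this model the G3 expression yields paths of two, three, and four
transistors, while the J3 and T3 expressions each yield paths of a
single length (two and three, respectively), consistent with the
comparison above.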

While particular embodiments of the present invention 
have been shown and described, it will be obvious to those 
skilled in the art that changes and modifications may be made 
without departing from this invention in its broader aspects 
and, therefore, the appended claims are to encompass within 
their scope all such changes and modifications as fall within 
the true spirit and scope of this invention. 